Skip to content

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Jul 8, 2025

This refactors our ValuesSourceReaderOperator so it can split pages when it reads large values. It does not actually split the pages as that's a bit tricky. But it sets the stage for the next PR that will do so.

  • Move ValuesSourceReaderOperator to it's own package
  • Move many inner classes into their own top level classes
  • Extend from AbstractPageMappingToIteratorOperator instead of AbstractPageMappingToOperator
    • This allows returning more than one Page per input Page
    • In this PR we still always return one Page per input Page
    • Make new ReleasableIterator subclasses to satisfy AbstractPageMappingToIteratorOperator
    • Change status of loading fields from pages_processed to pages_received and pages_emitted
  • Fix a bug in AbstractPageMappingToOperator which can leak circuit breaker allocation if we fail to during receive. This isn't possible in the existing implementations but is possible in ValuesSourceReaderOperator.
  • Add a test with large text fields. Right now it still comes back in one page because we don't cut the pages.

Closes #130727

This refactors our `ValuesSourceReaderOperator` so it can split pages
when it reads large values. It does not *actually* split the pages as
that's a bit tricky. But it sets the stage for the next PR that will
do so.

* Move `ValuesSourceReaderOperator` to it's own package
* Move many inner classes into their own top level classes
* Extend from `AbstractPageMappingToIteratorOperator` instead of
  `AbstractPageMappingToOperator`
  * This allows returning more than one `Page` per input `Page`
  * In this PR we still always return one `Page` per input `Page`
  * Make new `ReleasableIterator` subclasses to satisfy
    `AbstractPageMappingToIteratorOperator`
  * Change `status` of loading fields from `pages_processed` to
    `pages_received` and `pages_emitted`
* Fix a bug in `AbstractPageMappingToOperator` which can leak circuit
  breaker allocation if we fail to during `receive`. This isn't possible
  in the existing implementations but is possible
  in `ValuesSourceReaderOperator`.
* Add a test with large text fields. Right now it still comes back in
one page because we don't cut the pages.

Closes elastic#130727
@nik9000
Copy link
Member Author

nik9000 commented Jul 8, 2025

I'll merge #130838 into this to fix the double-close issues we're seeing.

nik9000 added 3 commits July 8, 2025 16:10
Fixes a bug during field loading where we could double-close blocks if
we failed to allocate memory during the un-shuffling portion of field
loading from single segments.

Unit test incoming in the followup.

Closes elastic#130426
Closes elastic#130790
Closes elastic#130791
Closes elastic#130792
Closes elastic#130793
Closes elastic#130270
Closes elastic#130788
Closes elastic#130122
Closes elastic#130827
@nik9000 nik9000 merged commit 4dae263 into elastic:8.19 Jul 8, 2025
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants